-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concurrency: always listen for lock state transitions when pushing #112768
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
-- commits
line 2 at r1:
"listen"
-- commits
line 13 at r1:
The most interesting case here is when the pusher (shared lock) doesn't directly conflict with the lock holder (shared lock), but it conflicts with a waiter (exclusive lock) in front of it in the queue, so it pushes the lock holder for deadlock detection purposes. In such cases, that waiter dropping out the queue should permit the pusher to proceed.
-- commits
line 16 at r1:
"the a"
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 368 at r1 (raw file):
// wait-queue without leaving behind a lock. In this case, the request // has a dependency on the conflicting request but not necessarily the // entire conflicting transaction.
I think retaining some mention of "dependency" in the new comments would be good, as this is one way to look at the situation.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 377 at r1 (raw file):
// such cases, the lock table will simply forget the lock when the // intent is resolved. Note that in such cases, the pusher may still // conflict with the intent and rediscover it -- that's okay.
It would be worth including the additional case I mentioned above.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 378 at r1 (raw file):
// intent is resolved. Note that in such cases, the pusher may still // conflict with the intent and rediscover it -- that's okay. if timerWaitingState.held {
nit: should we share the error handling logic to avoid duplication? Something like:
var err error
if timerWaitingState.held {
err = w.pushLockTxn(pushCtx, req, timerWaitingState)
} else {
err = w.pushRequestTxn(pushCtx, req, timerWaitingState)
}
// Ignore the context canceled error. If this was for the
// parent context then we'll notice on the next select.
//
// NOTE: we look at pushCtx.Err() and not err to avoid the
// potential for bugs if context cancellation is not
// propagated correctly on some error paths.
if errors.Is(pushCtx.Err(), context.Canceled) {
err = nil
}
return err
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 379 at r1 (raw file):
// conflict with the intent and rediscover it -- that's okay. if timerWaitingState.held { err = w.pushLockTxn(pushCtx, req, timerWaitingState)
This err
is referencing the function's named return variable. The fact that this is in scope is a holdover from 67c6bdb and feels moderately risky. Is there any risk to this aliasing? I think it would be better to un-name this return variable and declare locally scoped err
variables where needed.
pkg/kv/kvserver/concurrency/testdata/concurrency_manager/wait_elsewhere
line 98 at r1 (raw file):
---- # Abort the writing txn. This will cause the blocked request to unblock. Note
Looks like this comment needs an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR, RFAL!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
The most interesting case here is when the pusher (shared lock) doesn't directly conflict with the lock holder (shared lock), but it conflicts with a waiter (exclusive lock) in front of it in the queue, so it pushes the lock holder for deadlock detection purposes. In such cases, that waiter dropping out the queue should permit the pusher to proceed.
Added it here and to the comment below.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 368 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think retaining some mention of "dependency" in the new comments would be good, as this is one way to look at the situation.
Good point. Reworded to make this clearer.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 378 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: should we share the error handling logic to avoid duplication? Something like:
var err error if timerWaitingState.held { err = w.pushLockTxn(pushCtx, req, timerWaitingState) } else { err = w.pushRequestTxn(pushCtx, req, timerWaitingState) } // Ignore the context canceled error. If this was for the // parent context then we'll notice on the next select. // // NOTE: we look at pushCtx.Err() and not err to avoid the // potential for bugs if context cancellation is not // propagated correctly on some error paths. if errors.Is(pushCtx.Err(), context.Canceled) { err = nil } return err
Done.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 379 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This
err
is referencing the function's named return variable. The fact that this is in scope is a holdover from 67c6bdb and feels moderately risky. Is there any risk to this aliasing? I think it would be better to un-name this return variable and declare locally scopederr
variables where needed.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 379 at r2 (raw file):
// Concretely, a construction like follows: // - holder: shared // - wait-queue: exclusive, shared
nit: odd indentation here.
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 380 at r2 (raw file):
// - holder: shared // - wait-queue: exclusive, shared // In this case, the waiting shared lock request will push the holder.
Let's add one sentence about why the shared lock request is pushing the shared lock holder instead of the exclusive lock request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r=nvanbenschoten
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table_waiter.go
line 380 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's add one sentence about why the shared lock request is pushing the shared lock holder instead of the exclusive lock request.
Done.
🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
95e69bb
to
813e78a
Compare
bors r- |
bors r=nvanbenschoten |
🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
Prior to this patch, the lockTableWaiter would only listen for lock state transitions if it was pushing a transaction while waiting for an unheld lock. The reasoning was that if the lock was held, the pusher would not be able to proceed until the push returned. This isn't quite true -- there's a few cases where the request may no longer conflict with what's being tracked in the lock table: - A waiting request may not conflict with the lock holder, but a waiting request instead. In such cases, it'll push the lock holder even though it is compatible with it. However, if the conflicting waiting request drops out of the lock's wait queues, the pusher may proceed. - The lock may have been rolled back because of savepoints. - The lock may have been forgotten by the lock table (replicated locks are forgotten when they're updated). This patch changes the lockTableWaiter to also listen for state transitions when pushing a held lock's transaction. Cases where the pusher no longer conflicts with the lock state are detected and the push is cancelled. Conveniently, the updates to `resolve_pushed_intents` show the effect of making this change. Fixes cockroachdb#111596 Release note: None
bors r- |
bors r=nvanbenschoten |
Build succeeded: |
blathers backport 23.2 |
Prior to this patch, the lockTableWaiter would only listen for lock state transitions if it was pushing a transaction while waiting for an unheld lock. The reasoning was that if the lock was held, the pusher would not be able to proceed until the push returned. This isn't quite true -- there's a few cases where the request may no longer conflict with what's being tracked in the lock table:
This patch changes the lockTableWaiter to also listen for state transitions when pushing the a held lock's transaction. Cases where the pusher no longer conflicts with the lock state are detected and the push is cancelled.
Conveniently, the updates to
resolve_pushed_intents
show the effect of making this change.Fixes #111596
Release note: None